Dataset statistics
| Number of variables | 29 |
|---|---|
| Number of observations | 3096313 |
| Missing cells | 12139874 |
| Missing cells (%) | 13.5% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 740.9 MiB |
| Average record size in memory | 250.9 B |
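
The derived figures in this table can be recomputed from the raw counts. The following is a minimal sketch in plain Python, with the figures copied from the table above; the variable names are illustrative and not part of the report. It assumes the missing-cell percentage is taken over variables × observations and that 1 MiB is 1024 × 1024 bytes.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 29
n_observations = 3_096_313
missing_cells = 12_139_874
total_size_mib = 740.9

# Every variable contributes one cell per observation.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 MiB = 1024 * 1024 bytes.
avg_record_bytes = total_size_mib * 1024 * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```

Both results round to the values reported above (13.5% and 250.9 B).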
Variable types
| Numeric | 9 |
|---|---|
| Categorical | 19 |
| DateTime | 1 |
Alerts
| Alert | Type |
|---|---|
| i94yr has constant value "2016.0" | Constant |
| i94mon has constant value "4.0" | Constant |
| count has constant value "1.0" | Constant |
| matflag has constant value "M" | Constant |
| updated_at has constant value "2022-12-10 21:24:30.135479" | Constant |
| i94port has a high cardinality: 299 distinct values | High cardinality |
| i94addr has a high cardinality: 457 distinct values | High cardinality |
| visapost has a high cardinality: 530 distinct values | High cardinality |
| occup has a high cardinality: 111 distinct values | High cardinality |
| dtaddto has a high cardinality: 777 distinct values | High cardinality |
| insnum has a high cardinality: 1913 distinct values | High cardinality |
| airline has a high cardinality: 534 distinct values | High cardinality |
| fltno has a high cardinality: 7152 distinct values | High cardinality |
| cicid is highly overall correlated with arrdate and 1 other field | High correlation |
| i94cit is highly overall correlated with i94res and 2 other fields | High correlation |
| i94res is highly overall correlated with i94cit and 2 other fields | High correlation |
| arrdate is highly overall correlated with cicid | High correlation |
| depdate is highly overall correlated with arrdate | High correlation |
| i94bir is highly overall correlated with biryear | High correlation |
| biryear is highly overall correlated with i94bir | High correlation |
| entdepa is highly overall correlated with i94mode and 3 other fields | High correlation |
| visatype is highly overall correlated with i94cit and 5 other fields | High correlation |
| entdepu is highly overall correlated with entdepd | High correlation |
| i94visa is highly overall correlated with visatype | High correlation |
| i94mode is highly overall correlated with entdepa and 1 other field | High correlation |
| entdepd is highly overall correlated with i94mode and 3 other fields | High correlation |
| admnum is highly overall correlated with cicid and 5 other fields | High correlation |
| i94addr has 152592 (4.9%) missing values | Missing |
| depdate has 142457 (4.6%) missing values | Missing |
| visapost has 1881250 (60.8%) missing values | Missing |
| occup has 3088187 (99.7%) missing values | Missing |
| entdepd has 138429 (4.5%) missing values | Missing |
| entdepu has 3095921 (> 99.9%) missing values | Missing |
| matflag has 138429 (4.5%) missing values | Missing |
| gender has 414269 (13.4%) missing values | Missing |
| insnum has 2982605 (96.3%) missing values | Missing |
| airline has 83627 (2.7%) missing values | Missing |
| depdate is highly skewed (γ1 = 301.4912075) | Skewed |
| dtadfile is highly skewed (γ1 = -62.60338146) | Skewed |
| cicid has unique values | Unique |
Reproduction
| Analysis started | 2022-12-10 13:31:27.586910 |
|---|---|
| Analysis finished | 2022-12-10 13:39:31.014372 |
| Duration | 8 minutes and 3.43 seconds |
| Software version | pandas-profiling v3.5.0 |
| Download configuration | config.json |
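
The reported duration can be checked against the two timestamps. This is a small sketch using only the standard library; the format string is an assumption about how the timestamps are laid out, based on how they appear in the table above.

```python
from datetime import datetime

# Assumed timestamp layout, matching the values shown in the Reproduction table.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2022-12-10 13:31:27.586910", FMT)
finished = datetime.strptime("2022-12-10 13:39:31.014372", FMT)

# Split the elapsed time into whole minutes and remaining seconds.
elapsed = finished - started
minutes, seconds = divmod(elapsed.total_seconds(), 60)

print(f"{int(minutes)} minutes and {seconds:.2f} seconds")
```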
cicid
Real number (ℝ)
| Distinct | 3096313 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3078651.9 |
| Minimum | 6 |
|---|---|
| Maximum | 6102785 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 47.2 MiB |
Quantile statistics
| Minimum | 6 |
|---|---|
| 5-th percentile | 280126.6 |
| Q1 | 1577790 |
| median | 3103507 |
| Q3 | 4654341 |
| 95-th percentile | 5770285.4 |
| Maximum | 6102785 |
| Range | 6102779 |
| Interquartile range (IQR) | 3076551 |
Descriptive statistics
| Standard deviation | 1763278.1 |
|---|---|
| Coefficient of variation (CV) | 0.57274358 |
| Kurtosis | -1.186781 |
| Mean | 3078651.9 |
| Median Absolute Deviation (MAD) | 1528755 |
| Skewness | -0.018431938 |
| Sum | 9.5324698 × 10^12 |
| Variance | 3.1091497 × 10^12 |
| Monotonicity | Not monotonic |
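
Several of the statistics above are simple functions of the others, which gives a quick consistency check. The sketch below uses the figures reported for this variable (cicid, per the Unique alert) and the usual definitions: CV as standard deviation over mean, variance as the squared standard deviation, range as maximum minus minimum, and IQR as Q3 minus Q1. The variable names are illustrative only.

```python
# Values copied from the quantile and descriptive statistics tables.
mean = 3078651.9
std = 1763278.1
minimum, maximum = 6, 6102785
q1, q3 = 1577790, 4654341

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
value_range = maximum - minimum  # range
iqr = q3 - q1                    # interquartile range

print(f"CV = {cv:.4f}")
print(f"variance = {variance:.4e}")
print(f"range = {value_range}, IQR = {iqr}")
```

Small differences in the last digits are expected because the inputs are already rounded in the report.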
| Value | Count | Frequency (%) |
| 4904480 | 1 | < 0.1% |
| 1540172 | 1 | < 0.1% |
| 1540156 | 1 | < 0.1% |
| 1540157 | 1 | < 0.1% |
| 1540158 | 1 | < 0.1% |
| 1540159 | 1 | < 0.1% |
| 1540160 | 1 | < 0.1% |
| 1540161 | 1 | < 0.1% |
| 1540168 | 1 | < 0.1% |
| 1540169 | 1 | < 0.1% |
| Other values (3096303) | 3096303 |
| Value | Count | Frequency (%) |
| 6 | 1 | |
| 7 | 1 | |
| 15 | 1 | |
| 16 | 1 | |
| 17 | 1 | |
| 18 | 1 | |
| 19 | 1 | |
| 20 | 1 | |
| 21 | 1 | |
| 22 | 1 |
| Value | Count | Frequency (%) |
| 6102785 | 1 | |
| 6101166 | 1 | |
| 6101165 | 1 | |
| 6101164 | 1 | |
| 6101163 | 1 | |
| 6101162 | 1 | |
| 6101161 | 1 | |
| 6101160 | 1 | |
| 6101159 | 1 | |
| 6101158 | 1 |
i94yr
Categorical
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 47.2 MiB |
Length
| Max length | 6 |
|---|---|
| Median length | 6 |
| Mean length | 6 |
| Min length | 6 |
Characters and Unicode
| Total characters | 18577878 |
|---|---|
| Distinct characters | 5 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 2016.0 |
|---|---|
| 2nd row | 2016.0 |
| 3rd row | 2016.0 |
| 4th row | 2016.0 |
| 5th row | 2016.0 |
Common Values
| Value | Count | Frequency (%) |
| 2016.0 | 3096313 |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| 2016.0 | 3096313 |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 6192626 | |
| 2 | 3096313 | |
| 1 | 3096313 | |
| 6 | 3096313 | |
| . | 3096313 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 15481565 | |
| Other Punctuation | 3096313 | 16.7% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 6192626 | |
| 2 | 3096313 | |
| 1 | 3096313 | |
| 6 | 3096313 |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 3096313 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 18577878 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 6192626 | |
| 2 | 3096313 | |
| 1 | 3096313 | |
| 6 | 3096313 | |
| . | 3096313 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 18577878 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 6192626 | |
| 2 | 3096313 | |
| 1 | 3096313 | |
| 6 | 3096313 | |
| . | 3096313 |
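
The constant value is shown as the text label "2016.0", which suggests the column was read as a float and then profiled as a category. A minimal sketch of normalizing such labels to integers follows; `float_label_to_int` is a hypothetical helper, not something the report provides. The last lines cross-check the "Total characters" figure as label length times row count.

```python
def float_label_to_int(label: str) -> int:
    """Convert a label such as "2016.0" to the integer 2016."""
    number = float(label)
    if not number.is_integer():
        raise ValueError(f"{label!r} is not a whole number")
    return int(number)

year = float_label_to_int("2016.0")
month = float_label_to_int("4.0")
print(year, month)

# Cross-check of "Total characters": characters per label times row count.
n_observations = 3_096_313
total_characters = len("2016.0") * n_observations
print(total_characters)
```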
i94mon
Categorical
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 47.2 MiB |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Characters and Unicode
| Total characters | 9288939 |
|---|---|
| Distinct characters | 3 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 4.0 |
|---|---|
| 2nd row | 4.0 |
| 3rd row | 4.0 |
| 4th row | 4.0 |
| 5th row | 4.0 |
Common Values
| Value | Count | Frequency (%) |
| 4.0 | 3096313 |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| 4.0 | 3096313 |
Most occurring characters
| Value | Count | Frequency (%) |
| 4 | 3096313 | |
| . | 3096313 | |
| 0 | 3096313 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 6192626 | |
| Other Punctuation | 3096313 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 4 | 3096313 | |
| 0 | 3096313 |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 3096313 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 9288939 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 4 | 3096313 | |
| . | 3096313 | |
| 0 | 3096313 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 9288939 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 4 | 3096313 | |
| . | 3096313 | |
| 0 | 3096313 |
i94cit
Real number (ℝ)
| Distinct | 243 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 304.90693 |
| Minimum | 101 |
|---|---|
| Maximum | 999 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 47.2 MiB |
Quantile statistics
| Minimum | 101 |
|---|---|
| 5-th percentile | 111 |
| Q1 | 135 |
| median | 213 |
| Q3 | 512 |
| 95-th percentile | 691 |
| Maximum | 999 |
| Range | 898 |
| Interquartile range (IQR) | 377 |
Descriptive statistics
| Standard deviation | 210.02689 |
|---|---|
| Coefficient of variation (CV) | 0.68882293 |
| Kurtosis | -0.80809704 |
| Mean | 304.90693 |
| Median Absolute Deviation (MAD) | 84 |
| Skewness | 0.87415268 |
| Sum | 9.4408730 × 10^8 |
| Variance | 44111.294 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 135 | 360157 | 11.6% |
| 209 | 206873 | 6.7% |
| 245 | 191425 | 6.2% |
| 111 | 188766 | 6.1% |
| 582 | 175781 | 5.7% |
| 148 | 157806 | 5.1% |
| 254 | 137735 | 4.4% |
| 689 | 129833 | 4.2% |
| 213 | 110691 | 3.6% |
| 438 | 109884 | 3.5% |
| Other values (233) | 1327362 |
| Value | Count | Frequency (%) |
| 101 | 828 | < 0.1% |
| 102 | 82 | < 0.1% |
| 103 | 16136 | 0.5% |
| 104 | 20359 | 0.7% |
| 105 | 2571 | 0.1% |
| 107 | 17027 | 0.5% |
| 108 | 24797 | 0.8% |
| 109 | 2108 | 0.1% |
| 110 | 11954 | 0.4% |
| 111 | 188766 |
| Value | Count | Frequency (%) |
| 999 | 894 | |
| 770 | 2 | < 0.1% |
| 769 | 2 | < 0.1% |
| 766 | 61 | < 0.1% |
| 765 | 1 | < 0.1% |
| 764 | 3 | < 0.1% |
| 763 | 1 | < 0.1% |
| 760 | 1 | < 0.1% |
| 756 | 846 | |
| 752 | 56 | < 0.1% |
i94res
Real number (ℝ)
| Distinct | 229 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 303.28382 |
| Minimum | 101 |
|---|---|
| Maximum | 760 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 47.2 MiB |
Quantile statistics
| Minimum | 101 |
|---|---|
| 5-th percentile | 111 |
| Q1 | 131 |
| median | 213 |
| Q3 | 504 |
| 95-th percentile | 691 |
| Maximum | 760 |
| Range | 659 |
| Interquartile range (IQR) | 373 |
Descriptive statistics
| Standard deviation | 208.58321 |
|---|---|
| Coefficient of variation (CV) | 0.68774923 |
| Kurtosis | -0.85653379 |
| Mean | 303.28382 |
| Median Absolute Deviation (MAD) | 90 |
| Skewness | 0.84355474 |
| Sum | 9.3906163 × 10^8 |
| Variance | 43506.957 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 135 | 368421 | 11.9% |
| 209 | 249167 | 8.0% |
| 245 | 185609 | 6.0% |
| 111 | 185339 | 6.0% |
| 582 | 179603 | 5.8% |
| 112 | 156613 | 5.1% |
| 276 | 136312 | 4.4% |
| 689 | 134907 | 4.4% |
| 438 | 112407 | 3.6% |
| 213 | 107193 | 3.5% |
| Other values (219) | 1280742 |
| Value | Count | Frequency (%) |
| 101 | 929 | < 0.1% |
| 102 | 117 | < 0.1% |
| 103 | 15465 | 0.5% |
| 104 | 20796 | 0.7% |
| 105 | 2343 | 0.1% |
| 107 | 16153 | 0.5% |
| 108 | 24600 | 0.8% |
| 109 | 1983 | 0.1% |
| 110 | 11545 | 0.4% |
| 111 | 185339 |
| Value | Count | Frequency (%) |
| 760 | 2 | < 0.1% |
| 749 | 167 | < 0.1% |
| 748 | 24 | < 0.1% |
| 745 | 2113 | |
| 743 | 558 | < 0.1% |
| 736 | 12 | < 0.1% |
| 735 | 423 | < 0.1% |
| 732 | 358 | < 0.1% |
| 723 | 27 | < 0.1% |
| 721 | 551 | < 0.1% |
i94port
Categorical
| Distinct | 299 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 47.2 MiB |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Characters and Unicode
| Total characters | 9288939 |
|---|---|
| Distinct characters | 31 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 19 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | CHI |
|---|---|
| 2nd row | CHI |
| 3rd row | CHI |
| 4th row | CHI |
| 5th row | CHI |
Common Values
| Value | Count | Frequency (%) |
| NYC | 485916 | 15.7% |
| MIA | 343941 | 11.1% |
| LOS | 310163 | 10.0% |
| SFR | 152586 | 4.9% |
| ORL | 149195 | 4.8% |
| HHW | 142720 | 4.6% |
| NEW | 136122 | 4.4% |
| CHI | 130564 | 4.2% |
| HOU | 101481 | 3.3% |
| FTL | 95977 | 3.1% |
| Other values (289) | 1047648 |
Length
| Value | Count | Frequency (%) |
| nyc | 485916 | |
| mia | 343941 | 11.1% |
| los | 310163 | 10.0% |
| sfr | 152586 | 4.9% |
| orl | 149195 | 4.8% |
| hhw | 142720 | 4.6% |
| new | 136122 | 4.4% |
| chi | 130564 | 4.2% |
| hou | 101481 | 3.3% |
| ftl | 95977 | 3.1% |
| Other values (289) | 1047648 |
Most occurring characters
| Value | Count | Frequency (%) |
| A | 962689 | 10.4% |
| L | 864662 | 9.3% |
| S | 787316 | 8.5% |
| O | 728291 | 7.8% |
| N | 695763 | 7.5% |
| C | 673005 | 7.2% |
| H | 604114 | 6.5% |
| I | 538120 | 5.8% |
| Y | 510131 | 5.5% |
| M | 444432 | 4.8% |
| Other values (21) | 2480416 |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 9284083 | |
| Decimal Number | 4856 | 0.1% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| A | 962689 | 10.4% |
| L | 864662 | 9.3% |
| S | 787316 | 8.5% |
| O | 728291 | 7.8% |
| N | 695763 | 7.5% |
| C | 673005 | 7.2% |
| H | 604114 | 6.5% |
| I | 538120 | 5.8% |
| Y | 510131 | 5.5% |
| M | 444432 | 4.8% |
| Other values (16) | 2475560 |
Decimal Number
| Value | Count | Frequency (%) |
| 6 | 2382 | |
| 9 | 2378 | |
| 5 | 73 | 1.5% |
| 4 | 22 | 0.5% |
| 8 | 1 | < 0.1% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 9284083 | |
| Common | 4856 | 0.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| A | 962689 | 10.4% |
| L | 864662 | 9.3% |
| S | 787316 | 8.5% |
| O | 728291 | 7.8% |
| N | 695763 | 7.5% |
| C | 673005 | 7.2% |
| H | 604114 | 6.5% |
| I | 538120 | 5.8% |
| Y | 510131 | 5.5% |
| M | 444432 | 4.8% |
| Other values (16) | 2475560 |
Common
| Value | Count | Frequency (%) |
| 6 | 2382 | |
| 9 | 2378 | |
| 5 | 73 | 1.5% |
| 4 | 22 | 0.5% |
| 8 | 1 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 9288939 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| A | 962689 | 10.4% |
| L | 864662 | 9.3% |
| S | 787316 | 8.5% |
| O | 728291 | 7.8% |
| N | 695763 | 7.5% |
| C | 673005 | 7.2% |
| H | 604114 | 6.5% |
| I | 538120 | 5.8% |
| Y | 510131 | 5.5% |
| M | 444432 | 4.8% |
| Other values (21) | 2480416 |
arrdate
Real number (ℝ)
| Distinct | 30 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 20559.849 |
| Minimum | 20545 |
|---|---|
| Maximum | 20574 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 47.2 MiB |
Quantile statistics
| Minimum | 20545 |
|---|---|
| 5-th percentile | 20546 |
| Q1 | 20552 |
| median | 20560 |
| Q3 | 20567 |
| 95-th percentile | 20573 |
| Maximum | 20574 |
| Range | 29 |
| Interquartile range (IQR) | 15 |
Descriptive statistics
| Standard deviation | 8.7773395 |
|---|---|
| Coefficient of variation (CV) | 0.00042691654 |
| Kurtosis | -1.2040296 |
| Mean | 20559.849 |
| Median Absolute Deviation (MAD) | 8 |
| Skewness | -0.036326873 |
| Sum | 6.3659726 × 10^10 |
| Variance | 77.041688 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 20573 | 128267 | 4.1% |
| 20574 | 127155 | 4.1% |
| 20572 | 120971 | 3.9% |
| 20560 | 114970 | 3.7% |
| 20559 | 114803 | 3.7% |
| 20567 | 112883 | 3.6% |
| 20566 | 110304 | 3.6% |
| 20545 | 108407 | 3.5% |
| 20558 | 107557 | 3.5% |
| 20561 | 106474 | 3.4% |
| Other values (20) | 1944522 |
| Value | Count | Frequency (%) |
| 20545 | 108407 | |
| 20546 | 103196 | |
| 20547 | 99972 | |
| 20548 | 97653 | |
| 20549 | 91514 | |
| 20550 | 88273 | |
| 20551 | 99763 | |
| 20552 | 103660 | |
| 20553 | 105930 | |
| 20554 | 104394 |
| Value | Count | Frequency (%) |
| 20574 | 127155 | |
| 20573 | 128267 | |
| 20572 | 120971 | |
| 20571 | 99259 | |
| 20570 | 88100 | |
| 20569 | 99652 | |
| 20568 | 100203 | |
| 20567 | 112883 | |
| 20566 | 110304 | |
| 20565 | 105454 |
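
The arrdate values run from 20545 to 20574, with 30 distinct values, which is what a single 30-day month would produce. One plausible reading, which the report itself does not state, is that these are SAS-style day counts from 1960-01-01. Under that assumption, a small sketch of the conversion (the helper name is hypothetical):

```python
from datetime import date, timedelta

# Assumption: arrdate counts days elapsed since 1960-01-01 (SAS date convention).
SAS_EPOCH = date(1960, 1, 1)

def sas_days_to_date(days: float) -> date:
    """Convert a day count relative to SAS_EPOCH into a calendar date."""
    return SAS_EPOCH + timedelta(days=int(days))

first_arrival = sas_days_to_date(20545)  # reported minimum
last_arrival = sas_days_to_date(20574)   # reported maximum
print(first_arrival, last_arrival)
```

With this reading the minimum and maximum fall on the first and last day of April 2016, consistent with the constant i94yr and i94mon values.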
i94mode
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 239 |
| Missing (%) | < 0.1% |
| Memory size | 47.2 MiB |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Characters and Unicode
| Total characters | 9288222 |
|---|---|
| Distinct characters | 6 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1.0 |
|---|---|
| 2nd row | 1.0 |
| 3rd row | 1.0 |
| 4th row | 1.0 |
| 5th row | 1.0 |
Common Values
| Value | Count | Frequency (%) |
| 1.0 | 2994505 | 96.7% |
| 3.0 | 66660 | 2.2% |
| 2.0 | 26349 | 0.9% |
| 9.0 | 8560 | 0.3% |
| (Missing) | 239 | < 0.1% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| 1.0 | 2994505 | |
| 3.0 | 66660 | 2.2% |
| 2.0 | 26349 | 0.9% |
| 9.0 | 8560 | 0.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| . | 3096074 | |
| 0 | 3096074 | |
| 1 | 2994505 | |
| 3 | 66660 | 0.7% |
| 2 | 26349 | 0.3% |
| 9 | 8560 | 0.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 6192148 | |
| Other Punctuation | 3096074 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 3096074 | |
| 1 | 2994505 | |
| 3 | 66660 | 1.1% |
| 2 | 26349 | 0.4% |
| 9 | 8560 | 0.1% |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 3096074 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 9288222 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| . | 3096074 | |
| 0 | 3096074 | |
| 1 | 2994505 | |
| 3 | 66660 | 0.7% |
| 2 | 26349 | 0.3% |
| 9 | 8560 | 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 9288222 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| . | 3096074 | |
| 0 | 3096074 | |
| 1 | 2994505 | |
| 3 | 66660 | 0.7% |
| 2 | 26349 | 0.3% |
| 9 | 8560 | 0.1% |
i94addr
Categorical
| Distinct | 457 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 152592 |
| Missing (%) | 4.9% |
| Memory size | 47.2 MiB |
Length
| Max length | 2 |
|---|---|
| Median length | 2 |
| Mean length | 1.9999541 |
| Min length | 1 |
Characters and Unicode
| Total characters | 5887307 |
|---|---|
| Distinct characters | 37 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 123 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | VA |
|---|---|
| 2nd row | VA |
| 3rd row | VA |
| 4th row | WA |
| 5th row | DE |
Common Values
| Value | Count | Frequency (%) |
| FL | 621701 | 20.1% |
| NY | 553677 | 17.9% |
| CA | 470386 | 15.2% |
| HI | 168764 | 5.5% |
| TX | 134321 | 4.3% |
| NV | 114609 | 3.7% |
| GU | 94107 | 3.0% |
| IL | 82126 | 2.7% |
| NJ | 76531 | 2.5% |
| MA | 70486 | 2.3% |
| Other values (447) | 557013 | 18.0% |
| (Missing) | 152592 | 4.9% |
Length
| Value | Count | Frequency (%) |
| fl | 621701 | |
| ny | 553677 | |
| ca | 470386 | |
| hi | 168764 | 5.7% |
| tx | 134321 | 4.6% |
| nv | 114609 | 3.9% |
| gu | 94107 | 3.2% |
| il | 82126 | 2.8% |
| nj | 76531 | 2.6% |
| ma | 70486 | 2.4% |
| Other values (437) | 557013 |
Most occurring characters
| Value | Count | Frequency (%) |
| N | 837957 | |
| A | 763519 | |
| L | 735142 | |
| F | 622101 | |
| C | 561997 | |
| Y | 560039 | |
| I | 311030 | 5.3% |
| H | 190789 | 3.2% |
| T | 172478 | 2.9% |
| M | 159747 | 2.7% |
| Other values (27) | 972508 |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 5886932 | |
| Decimal Number | 268 | < 0.1% |
| Other Punctuation | 107 | < 0.1% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| N | 837957 | |
| A | 763519 | |
| L | 735142 | |
| F | 622101 | |
| C | 561997 | |
| Y | 560039 | |
| I | 311030 | 5.3% |
| H | 190789 | 3.2% |
| T | 172478 | 2.9% |
| M | 159747 | 2.7% |
| Other values (16) | 972133 |
Decimal Number
| Value | Count | Frequency (%) |
| 9 | 111 | |
| 0 | 56 | |
| 1 | 31 | 11.6% |
| 3 | 22 | 8.2% |
| 2 | 18 | 6.7% |
| 7 | 11 | 4.1% |
| 6 | 6 | 2.2% |
| 5 | 5 | 1.9% |
| 4 | 4 | 1.5% |
| 8 | 4 | 1.5% |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 107 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 5886932 | |
| Common | 375 | < 0.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| N | 837957 | |
| A | 763519 | |
| L | 735142 | |
| F | 622101 | |
| C | 561997 | |
| Y | 560039 | |
| I | 311030 | 5.3% |
| H | 190789 | 3.2% |
| T | 172478 | 2.9% |
| M | 159747 | 2.7% |
| Other values (16) | 972133 |
Common
| Value | Count | Frequency (%) |
| 9 | 111 | |
| . | 107 | |
| 0 | 56 | |
| 1 | 31 | 8.3% |
| 3 | 22 | 5.9% |
| 2 | 18 | 4.8% |
| 7 | 11 | 2.9% |
| 6 | 6 | 1.6% |
| 5 | 5 | 1.3% |
| 4 | 4 | 1.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 5887307 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| N | 837957 | |
| A | 763519 | |
| L | 735142 | |
| F | 622101 | |
| C | 561997 | |
| Y | 560039 | |
| I | 311030 | 5.3% |
| H | 190789 | 3.2% |
| T | 172478 | 2.9% |
| M | 159747 | 2.7% |
| Other values (27) | 972508 |
depdate
Real number (ℝ)
| Distinct | 235 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 142457 |
| Missing (%) | 4.6% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 20573.953 |
| Minimum | 15176 |
|---|---|
| Maximum | 45427 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 47.2 MiB |
Quantile statistics
| Minimum | 15176 |
|---|---|
| 5-th percentile | 20552 |
| Q1 | 20561 |
| median | 20570 |
| Q3 | 20579 |
| 95-th percentile | 20615 |
| Maximum | 45427 |
| Range | 30251 |
| Interquartile range (IQR) | 18 |
Descriptive statistics
| Standard deviation | 29.356968 |
|---|---|
| Coefficient of variation (CV) | 0.0014268998 |
| Kurtosis | 238342.98 |
| Mean | 20573.953 |
| Median Absolute Deviation (MAD) | 9 |
| Skewness | 301.49121 |
| Sum | 6.0772494 × 10^10 |
| Variance | 861.8316 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 20574 | 102689 | 3.3% |
| 20575 | 102510 | 3.3% |
| 20573 | 98048 | 3.2% |
| 20567 | 95936 | 3.1% |
| 20568 | 94075 | 3.0% |
| 20566 | 92317 | 3.0% |
| 20561 | 88298 | 2.9% |
| 20560 | 87819 | 2.8% |
| 20576 | 87417 | 2.8% |
| 20559 | 85894 | 2.8% |
| Other values (225) | 2018853 | |
| (Missing) | 142457 | 4.6% |
| Value | Count | Frequency (%) |
| 15176 | 1 | < 0.1% |
| 19095 | 5 | |
| 19097 | 1 | < 0.1% |
| 19835 | 1 | < 0.1% |
| 19837 | 1 | < 0.1% |
| 19860 | 1 | < 0.1% |
| 20181 | 1 | < 0.1% |
| 20186 | 1 | < 0.1% |
| 20194 | 1 | < 0.1% |
| 20196 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 45427 | 1 | < 0.1% |
| 39935 | 1 | < 0.1% |
| 21919 | 1 | < 0.1% |
| 20716 | 156 | < 0.1% |
| 20715 | 997 | |
| 20714 | 1137 | |
| 20713 | 794 | |
| 20712 | 1109 | |
| 20711 | 870 | |
| 20710 | 668 |
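
The very large skewness and kurtosis reported for this variable appear to be driven by a handful of extreme values at both ends of the listings above, while the bulk of the data sits within a few weeks of the arrival dates. The sketch below flags such values with a simple plausibility rule. The rule is only an illustration: a departure cannot precede the earliest arrival in the dataset (arrdate minimum 20545), and `MAX_STAY_DAYS` is a hypothetical cutoff chosen for the example, not something taken from the report.

```python
ARRDATE_MIN = 20545   # smallest arrdate reported in this profile
ARRDATE_MAX = 20574   # largest arrdate reported in this profile
MAX_STAY_DAYS = 366   # hypothetical cutoff, for illustration only

def is_suspect_depdate(depdate: int) -> bool:
    """Flag departures before any arrival or implausibly far after the last one."""
    too_early = depdate < ARRDATE_MIN
    too_late = depdate > ARRDATE_MAX + MAX_STAY_DAYS
    return too_early or too_late

# A few values taken from the minimum and maximum listings, plus the median.
samples = [15176, 19095, 20196, 20570, 20716, 21919, 39935, 45427]
suspects = [d for d in samples if is_suspect_depdate(d)]
print(suspects)
```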
i94bir
Real number (ℝ)
| Distinct | 112 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 802 |
| Missing (%) | < 0.1% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 41.767614 |
| Minimum | -3 |
|---|---|
| Maximum | 114 |
| Zeros | 765 |
| Zeros (%) | < 0.1% |
| Negative | 1 |
| Negative (%) | < 0.1% |
| Memory size | 47.2 MiB |
Quantile statistics
| Minimum | -3 |
|---|---|
| 5-th percentile | 11 |
| Q1 | 30 |
| median | 41 |
| Q3 | 54 |
| 95-th percentile | 70 |
| Maximum | 114 |
| Range | 117 |
| Interquartile range (IQR) | 24 |
Descriptive statistics
| Standard deviation | 17.420261 |
|---|---|
| Coefficient of variation (CV) | 0.41707578 |
| Kurtosis | -0.42093114 |
| Mean | 41.767614 |
| Median Absolute Deviation (MAD) | 12 |
| Skewness | -0.036911348 |
| Sum | 1.2929211 × 10^8 |
| Variance | 303.46548 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 30 | 71958 | 2.3% |
| 33 | 70415 | 2.3% |
| 31 | 70409 | 2.3% |
| 34 | 70251 | 2.3% |
| 32 | 69809 | 2.3% |
| 35 | 69626 | 2.2% |
| 36 | 67960 | 2.2% |
| 29 | 67762 | 2.2% |
| 40 | 66568 | 2.1% |
| 37 | 66494 | 2.1% |
| Other values (102) | 2404259 |
| Value | Count | Frequency (%) |
| -3 | 1 | < 0.1% |
| 0 | 765 | < 0.1% |
| 1 | 12747 | |
| 2 | 14756 | |
| 3 | 12704 | |
| 4 | 14411 | |
| 5 | 15129 | |
| 6 | 15773 | |
| 7 | 14233 | |
| 8 | 14607 |
| Value | Count | Frequency (%) |
| 114 | 1 | < 0.1% |
| 111 | 1 | < 0.1% |
| 110 | 1 | < 0.1% |
| 109 | 2 | |
| 108 | 2 | |
| 107 | 1 | < 0.1% |
| 105 | 2 | |
| 103 | 1 | < 0.1% |
| 102 | 4 | |
| 101 | 2 |
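
The extremes above include a negative value (-3) and several values above 100, and the alerts report that i94bir is highly correlated with biryear. One possible explanation, which is an assumption and not stated in the report, is that i94bir is an age derived as the record year (2016) minus the birth year. A minimal sketch of that relation and of a range check follows; both helpers and the `max_age` threshold are hypothetical.

```python
I94_YEAR = 2016  # constant i94yr value reported in this profile

def age_from_birth_year(biryear: int) -> int:
    """Assumed relation between biryear and i94bir."""
    return I94_YEAR - biryear

def is_plausible_age(age: int, max_age: int = 110) -> bool:
    """Range check; max_age is an illustrative threshold."""
    return 0 <= age <= max_age

print(age_from_birth_year(1986))
print([age for age in (-3, 0, 41, 114) if not is_plausible_age(age)])
```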
i94visa
Categorical
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 47.2 MiB |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Characters and Unicode
| Total characters | 9288939 |
|---|---|
| Distinct characters | 5 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 2.0 |
|---|---|
| 2nd row | 2.0 |
| 3rd row | 2.0 |
| 4th row | 1.0 |
| 5th row | 2.0 |
Common Values
| Value | Count | Frequency (%) |
| 2.0 | 2530868 | 81.7% |
| 1.0 | 522079 | 16.9% |
| 3.0 | 43366 | 1.4% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| 2.0 | 2530868 | |
| 1.0 | 522079 | 16.9% |
| 3.0 | 43366 | 1.4% |
Most occurring characters
| Value | Count | Frequency (%) |
| . | 3096313 | |
| 0 | 3096313 | |
| 2 | 2530868 | |
| 1 | 522079 | 5.6% |
| 3 | 43366 | 0.5% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 6192626 | |
| Other Punctuation | 3096313 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 3096313 | |
| 2 | 2530868 | |
| 1 | 522079 | 8.4% |
| 3 | 43366 | 0.7% |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 3096313 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 9288939 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| . | 3096313 | |
| 0 | 3096313 | |
| 2 | 2530868 | |
| 1 | 522079 | 5.6% |
| 3 | 43366 | 0.5% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 9288939 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| . | 3096313 | |
| 0 | 3096313 | |
| 2 | 2530868 | |
| 1 | 522079 | 5.6% |
| 3 | 43366 | 0.5% |
count
Categorical
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 47.2 MiB |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Characters and Unicode
| Total characters | 9288939 |
|---|---|
| Distinct characters | 3 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1.0 |
|---|---|
| 2nd row | 1.0 |
| 3rd row | 1.0 |
| 4th row | 1.0 |
| 5th row | 1.0 |
Common Values
| Value | Count | Frequency (%) |
| 1.0 | 3096313 |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| 1.0 | 3096313 |
Most occurring characters
| Value | Count | Frequency (%) |
| 1 | 3096313 | |
| . | 3096313 | |
| 0 | 3096313 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 6192626 | |
| Other Punctuation | 3096313 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 3096313 | |
| 0 | 3096313 |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 3096313 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 9288939 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 1 | 3096313 | |
| . | 3096313 | |
| 0 | 3096313 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 9288939 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 1 | 3096313 | |
| . | 3096313 | |
| 0 | 3096313 |
dtadfile
Real number (ℝ)
| Distinct | 117 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 1 |
| Missing (%) | < 0.1% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 20160425 |
| Minimum | 20130811 |
|---|---|
| Maximum | 20160919 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 47.2 MiB |
Quantile statistics
| Minimum | 20130811 |
|---|---|
| 5-th percentile | 20160402 |
| Q1 | 20160409 |
| median | 20160417 |
| Q3 | 20160424 |
| 95-th percentile | 20160430 |
| Maximum | 20160919 |
| Range | 30108 |
| Interquartile range (IQR) | 15 |
Descriptive statistics
| Standard deviation | 50.015134 |
|---|---|
| Coefficient of variation (CV) | 2.4808572 × 10^-6 |
| Kurtosis | 39717.382 |
| Mean | 20160425 |
| Median Absolute Deviation (MAD) | 8 |
| Skewness | -62.603381 |
| Sum | 6.2422965 × 10^13 |
| Variance | 2501.5137 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 20160430 | 125570 | 4.1% |
| 20160429 | 120497 | 3.9% |
| 20160417 | 119296 | 3.9% |
| 20160428 | 116601 | 3.8% |
| 20160415 | 109746 | 3.5% |
| 20160423 | 107079 | 3.5% |
| 20160414 | 106306 | 3.4% |
| 20160422 | 105575 | 3.4% |
| 20160401 | 103231 | 3.3% |
| 20160409 | 102239 | 3.3% |
| Other values (107) | 1980172 |
| Value | Count | Frequency (%) |
| 20130811 | 1 | < 0.1% |
| 20160401 | 103231 | |
| 20160402 | 98915 | |
| 20160403 | 94852 | |
| 20160404 | 94511 | |
| 20160405 | 86195 | |
| 20160406 | 85177 | |
| 20160407 | 95901 | |
| 20160408 | 99181 | |
| 20160409 | 102239 |
| Value | Count | Frequency (%) |
| 20160919 | 1 | < 0.1% |
| 20160918 | 14 | < 0.1% |
| 20160917 | 22 | < 0.1% |
| 20160916 | 18 | < 0.1% |
| 20160915 | 3 | < 0.1% |
| 20160914 | 8 | < 0.1% |
| 20160913 | 5 | < 0.1% |
| 20160909 | 125 | |
| 20160906 | 7 | < 0.1% |
| 20160902 | 9 | < 0.1% |
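
The dtadfile values look like dates written as YYYYMMDD and stored as numbers (for example 20160430). If that reading is right, which is an assumption here, then the numeric statistics above, such as the range of 30108, describe the encoded integers and not elapsed time. A sketch of decoding the values before comparing them (the helper name is hypothetical):

```python
from datetime import date, datetime

def parse_yyyymmdd(value: float) -> date:
    """Decode a number such as 20160430 into a calendar date (assumed encoding)."""
    return datetime.strptime(str(int(value)), "%Y%m%d").date()

earliest = parse_yyyymmdd(20130811)  # reported minimum
latest = parse_yyyymmdd(20160919)    # reported maximum

# Difference of the encoded integers versus the actual span in days.
encoded_range = 20160919 - 20130811
span_days = (latest - earliest).days
print(encoded_range, span_days)
```

The two numbers differ by more than an order of magnitude, so skewness, variance, and range for this column are best read with the encoding in mind.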
visapost
Categorical
| Distinct | 530 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 1881250 |
| Missing (%) | 60.8% |
| Memory size | 47.2 MiB |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Characters and Unicode
| Total characters | 3645189 |
|---|---|
| Distinct characters | 27 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 62 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | BEJ |
|---|---|
| 2nd row | BEJ |
| 3rd row | BEJ |
| 4th row | BEJ |
| 5th row | BEJ |
Common Values
| Value | Count | Frequency (%) |
| MEX | 84720 | 2.7% |
| SPL | 65678 | 2.1% |
| BNS | 62032 | 2.0% |
| GUZ | 48298 | 1.6% |
| BGT | 46074 | 1.5% |
| CRS | 37137 | 1.2% |
| BEJ | 36703 | 1.2% |
| SHG | 35507 | 1.1% |
| GDL | 30970 | 1.0% |
| RDJ | 29943 | 1.0% |
| Other values (520) | 738001 | 23.8% |
| (Missing) | 1881250 | 60.8% |
Length
| Value | Count | Frequency (%) |
| mex | 84720 | 7.0% |
| spl | 65678 | 5.4% |
| bns | 62032 | 5.1% |
| guz | 48298 | 4.0% |
| bgt | 46074 | 3.8% |
| crs | 37137 | 3.1% |
| bej | 36703 | 3.0% |
| shg | 35507 | 2.9% |
| gdl | 30970 | 2.5% |
| rdj | 29943 | 2.5% |
| Other values (520) | 738001 |
Most occurring characters
| Value | Count | Frequency (%) |
| S | 390557 | 10.7% |
| M | 282799 | 7.8% |
| G | 279336 | 7.7% |
| B | 276638 | 7.6% |
| N | 259522 | 7.1% |
| T | 237480 | 6.5% |
| L | 227298 | 6.2% |
| R | 204924 | 5.6% |
| D | 196076 | 5.4% |
| E | 166542 | 4.6% |
| Other values (17) | 1124017 |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 3643377 | |
| Decimal Number | 1812 | < 0.1% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| S | 390557 | 10.7% |
| M | 282799 | 7.8% |
| G | 279336 | 7.7% |
| B | 276638 | 7.6% |
| N | 259522 | 7.1% |
| T | 237480 | 6.5% |
| L | 227298 | 6.2% |
| R | 204924 | 5.6% |
| D | 196076 | 5.4% |
| E | 166542 | 4.6% |
| Other values (16) | 1122205 |
Decimal Number
| Value | Count | Frequency (%) |
| 9 | 1812 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 3643377 | |
| Common | 1812 | < 0.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| S | 390557 | 10.7% |
| M | 282799 | 7.8% |
| G | 279336 | 7.7% |
| B | 276638 | 7.6% |
| N | 259522 | 7.1% |
| T | 237480 | 6.5% |
| L | 227298 | 6.2% |
| R | 204924 | 5.6% |
| D | 196076 | 5.4% |
| E | 166542 | 4.6% |
| Other values (16) | 1122205 |
Common
| Value | Count | Frequency (%) |
| 9 | 1812 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 3645189 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| S | 390557 | 10.7% |
| M | 282799 | 7.8% |
| G | 279336 | 7.7% |
| B | 276638 | 7.6% |
| N | 259522 | 7.1% |
| T | 237480 | 6.5% |
| L | 227298 | 6.2% |
| R | 204924 | 5.6% |
| D | 196076 | 5.4% |
| E | 166542 | 4.6% |
| Other values (17) | 1124017 |
occup
Categorical
| Distinct | 111 |
|---|---|
| Distinct (%) | 1.4% |
| Missing | 3088187 |
| Missing (%) | 99.7% |
| Memory size | 47.2 MiB |
| STU | |
|---|---|
| OTH | |
| NRR | 345 |
| MKT | 280 |
| EXA | 196 |
| Other values (106) |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Characters and Unicode
| Total characters | 24378 |
|---|---|
| Distinct characters | 32 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 15 |
|---|---|
| Unique (%) | 0.2% |
Sample
| 1st row | STU |
|---|---|
| 2nd row | STU |
| 3rd row | STU |
| 4th row | NRR |
| 5th row | STU |
Common Values
| Value | Count | Frequency (%) |
| STU | 4719 | 0.2% |
| OTH | 661 | < 0.1% |
| NRR | 345 | < 0.1% |
| MKT | 280 | < 0.1% |
| EXA | 196 | < 0.1% |
| GLS | 189 | < 0.1% |
| ULS | 175 | < 0.1% |
| ADM | 125 | < 0.1% |
| TIE | 124 | < 0.1% |
| MVC | 110 | < 0.1% |
| Other values (101) | 1202 | < 0.1% |
| (Missing) | 3088187 | 99.7% |
Words
| Value | Count | Frequency (%) |
| stu | 4719 | |
| oth | 661 | 8.1% |
| nrr | 345 | 4.2% |
| mkt | 280 | 3.4% |
| exa | 196 | 2.4% |
| gls | 189 | 2.3% |
| uls | 175 | 2.2% |
| adm | 125 | 1.5% |
| tie | 124 | 1.5% |
| mvc | 110 | 1.4% |
| Other values (101) | 1202 | 14.8% |
Most occurring characters
| Value | Count | Frequency (%) |
| T | 6150 | |
| S | 5241 | |
| U | 4944 | |
| H | 893 | 3.7% |
| R | 873 | 3.6% |
| O | 803 | 3.3% |
| E | 727 | 3.0% |
| M | 691 | 2.8% |
| L | 546 | 2.2% |
| N | 542 | 2.2% |
| Other values (22) | 2968 |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 24227 | |
| Decimal Number | 151 | 0.6% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| T | 6150 | |
| S | 5241 | |
| U | 4944 | |
| H | 893 | 3.7% |
| R | 873 | 3.6% |
| O | 803 | 3.3% |
| E | 727 | 3.0% |
| M | 691 | 2.9% |
| L | 546 | 2.3% |
| N | 542 | 2.2% |
| Other values (14) | 2817 |
Decimal Number
| Value | Count | Frequency (%) |
| 9 | 85 | |
| 1 | 33 | 21.9% |
| 5 | 14 | 9.3% |
| 8 | 7 | 4.6% |
| 0 | 5 | 3.3% |
| 2 | 4 | 2.6% |
| 4 | 2 | 1.3% |
| 3 | 1 | 0.7% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 24227 | |
| Common | 151 | 0.6% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| T | 6150 | |
| S | 5241 | |
| U | 4944 | |
| H | 893 | 3.7% |
| R | 873 | 3.6% |
| O | 803 | 3.3% |
| E | 727 | 3.0% |
| M | 691 | 2.9% |
| L | 546 | 2.3% |
| N | 542 | 2.2% |
| Other values (14) | 2817 |
Common
| Value | Count | Frequency (%) |
| 9 | 85 | |
| 1 | 33 | 21.9% |
| 5 | 14 | 9.3% |
| 8 | 7 | 4.6% |
| 0 | 5 | 3.3% |
| 2 | 4 | 2.6% |
| 4 | 2 | 1.3% |
| 3 | 1 | 0.7% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 24378 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| T | 6150 | |
| S | 5241 | |
| U | 4944 | |
| H | 893 | 3.7% |
| R | 873 | 3.6% |
| O | 803 | 3.3% |
| E | 727 | 3.0% |
| M | 691 | 2.8% |
| L | 546 | 2.2% |
| N | 542 | 2.2% |
| Other values (22) | 2968 |
entdepa
Categorical
| Distinct | 13 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 238 |
| Missing (%) | < 0.1% |
| Memory size | 47.2 MiB |
| G | |
|---|---|
| O | |
| A | 108560 |
| Z | 64864 |
| T | 61144 |
| Other values (8) | 48868 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 3096075 |
|---|---|
| Distinct characters | 13 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | G |
|---|---|
| 2nd row | G |
| 3rd row | G |
| 4th row | G |
| 5th row | G |
Common Values
| Value | Count | Frequency (%) |
| G | 2399582 | 77.5% |
| O | 413057 | 13.3% |
| A | 108560 | 3.5% |
| Z | 64864 | 2.1% |
| T | 61144 | 2.0% |
| K | 17076 | 0.6% |
| P | 14397 | 0.5% |
| H | 14341 | 0.5% |
| U | 2371 | 0.1% |
| B | 401 | < 0.1% |
| Other values (3) | 282 | < 0.1% |
Words
| Value | Count | Frequency (%) |
| g | 2399582 | |
| o | 413057 | 13.3% |
| a | 108560 | 3.5% |
| z | 64864 | 2.1% |
| t | 61144 | 2.0% |
| k | 17076 | 0.6% |
| p | 14397 | 0.5% |
| h | 14341 | 0.5% |
| u | 2371 | 0.1% |
| b | 401 | < 0.1% |
| Other values (3) | 282 | < 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| G | 2399582 | |
| O | 413057 | 13.3% |
| A | 108560 | 3.5% |
| Z | 64864 | 2.1% |
| T | 61144 | 2.0% |
| K | 17076 | 0.6% |
| P | 14397 | 0.5% |
| H | 14341 | 0.5% |
| U | 2371 | 0.1% |
| B | 401 | < 0.1% |
| Other values (3) | 282 | < 0.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 3096075 |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| G | 2399582 | |
| O | 413057 | 13.3% |
| A | 108560 | 3.5% |
| Z | 64864 | 2.1% |
| T | 61144 | 2.0% |
| K | 17076 | 0.6% |
| P | 14397 | 0.5% |
| H | 14341 | 0.5% |
| U | 2371 | 0.1% |
| B | 401 | < 0.1% |
| Other values (3) | 282 | < 0.1% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 3096075 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| G | 2399582 | |
| O | 413057 | 13.3% |
| A | 108560 | 3.5% |
| Z | 64864 | 2.1% |
| T | 61144 | 2.0% |
| K | 17076 | 0.6% |
| P | 14397 | 0.5% |
| H | 14341 | 0.5% |
| U | 2371 | 0.1% |
| B | 401 | < 0.1% |
| Other values (3) | 282 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 3096075 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| G | 2399582 | |
| O | 413057 | 13.3% |
| A | 108560 | 3.5% |
| Z | 64864 | 2.1% |
| T | 61144 | 2.0% |
| K | 17076 | 0.6% |
| P | 14397 | 0.5% |
| H | 14341 | 0.5% |
| U | 2371 | 0.1% |
| B | 401 | < 0.1% |
| Other values (3) | 282 | < 0.1% |
entdepd
Categorical
| Distinct | 12 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 138429 |
| Missing (%) | 4.5% |
| Memory size | 47.2 MiB |
| O | |
|---|---|
| I | 99846 |
| D | 96518 |
| N | 76192 |
| K | 70624 |
| Other values (7) | 101072 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 2957884 |
|---|---|
| Distinct characters | 12 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | O |
|---|---|
| 2nd row | O |
| 3rd row | O |
| 4th row | O |
| 5th row | O |
Common Values
| Value | Count | Frequency (%) |
| O | 2513632 | 81.2% |
| I | 99846 | 3.2% |
| D | 96518 | 3.1% |
| N | 76192 | 2.5% |
| K | 70624 | 2.3% |
| Q | 52729 | 1.7% |
| R | 41879 | 1.4% |
| W | 3887 | 0.1% |
| J | 1758 | 0.1% |
| V | 762 | < 0.1% |
| Other values (2) | 57 | < 0.1% |
| (Missing) | 138429 | 4.5% |
Words
| Value | Count | Frequency (%) |
| o | 2513632 | |
| i | 99846 | 3.4% |
| d | 96518 | 3.3% |
| n | 76192 | 2.6% |
| k | 70624 | 2.4% |
| q | 52729 | 1.8% |
| r | 41879 | 1.4% |
| w | 3887 | 0.1% |
| j | 1758 | 0.1% |
| v | 762 | < 0.1% |
| Other values (2) | 57 | < 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| O | 2513632 | |
| I | 99846 | 3.4% |
| D | 96518 | 3.3% |
| N | 76192 | 2.6% |
| K | 70624 | 2.4% |
| Q | 52729 | 1.8% |
| R | 41879 | 1.4% |
| W | 3887 | 0.1% |
| J | 1758 | 0.1% |
| V | 762 | < 0.1% |
| Other values (2) | 57 | < 0.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 2957884 |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| O | 2513632 | |
| I | 99846 | 3.4% |
| D | 96518 | 3.3% |
| N | 76192 | 2.6% |
| K | 70624 | 2.4% |
| Q | 52729 | 1.8% |
| R | 41879 | 1.4% |
| W | 3887 | 0.1% |
| J | 1758 | 0.1% |
| V | 762 | < 0.1% |
| Other values (2) | 57 | < 0.1% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 2957884 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| O | 2513632 | |
| I | 99846 | 3.4% |
| D | 96518 | 3.3% |
| N | 76192 | 2.6% |
| K | 70624 | 2.4% |
| Q | 52729 | 1.8% |
| R | 41879 | 1.4% |
| W | 3887 | 0.1% |
| J | 1758 | 0.1% |
| V | 762 | < 0.1% |
| Other values (2) | 57 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 2957884 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| O | 2513632 | |
| I | 99846 | 3.4% |
| D | 96518 | 3.3% |
| N | 76192 | 2.6% |
| K | 70624 | 2.4% |
| Q | 52729 | 1.8% |
| R | 41879 | 1.4% |
| W | 3887 | 0.1% |
| J | 1758 | 0.1% |
| V | 762 | < 0.1% |
| Other values (2) | 57 | < 0.1% |
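The same count appears with two different percentages in this section: 3.2% in Common Values and 3.4% in the character tables. A minimal sketch of why, assuming that Common Values divides by all observations while the character tables divide by the non-missing values only (the totals above are consistent with this, but the report does not state it). The variable names are illustrative.

```python
TOTAL_ROWS = 3096313   # "Number of observations" in the dataset statistics
MISSING = 138429       # "Missing" for this variable
COUNT_I = 99846        # count reported for the value "I"

# Non-missing rows; this should equal "Total characters" for a length-1 column.
non_missing = TOTAL_ROWS - MISSING


def pct(count: int, denominator: int) -> float:
    """Percentage rounded to one decimal, as displayed in the report."""
    return round(100 * count / denominator, 1)


share_of_all_rows = pct(COUNT_I, TOTAL_ROWS)
share_of_non_missing = pct(COUNT_I, non_missing)
print(non_missing, share_of_all_rows, share_of_non_missing)
```

The gap between the two figures grows with the missing rate, so it matters most for sparsely populated columns.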
entdepu
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | 0.5% |
| Missing | 3095921 |
| Missing (%) | > 99.9% |
| Memory size | 47.2 MiB |
| U | |
|---|---|
| Y | 1 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 392 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 1 |
|---|---|
| Unique (%) | 0.3% |
Sample
| 1st row | U |
|---|---|
| 2nd row | U |
| 3rd row | U |
| 4th row | U |
| 5th row | U |
Common Values
| Value | Count | Frequency (%) |
| U | 391 | < 0.1% |
| Y | 1 | < 0.1% |
| (Missing) | 3095921 | > 99.9% |
Words
| Value | Count | Frequency (%) |
| u | 391 | |
| y | 1 | 0.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| U | 391 | |
| Y | 1 | 0.3% |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 392 |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| U | 391 | |
| Y | 1 | 0.3% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 392 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| U | 391 | |
| Y | 1 | 0.3% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 392 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| U | 391 | |
| Y | 1 | 0.3% |
matflag
Categorical
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 138429 |
| Missing (%) | 4.5% |
| Memory size | 47.2 MiB |
| M |
|---|
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 2957884 |
|---|---|
| Distinct characters | 1 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | M |
|---|---|
| 2nd row | M |
| 3rd row | M |
| 4th row | M |
| 5th row | M |
Common Values
| Value | Count | Frequency (%) |
| M | 2957884 | 95.5% |
| (Missing) | 138429 | 4.5% |
Words
| Value | Count | Frequency (%) |
| m | 2957884 |
Most occurring characters
| Value | Count | Frequency (%) |
| M | 2957884 |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 2957884 |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| M | 2957884 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 2957884 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| M | 2957884 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 2957884 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| M | 2957884 |
biryear
Real number (ℝ)
| Distinct | 112 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 802 |
| Missing (%) | < 0.1% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1974.2324 |
| Minimum | 1902 |
|---|---|
| Maximum | 2019 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 47.2 MiB |
Quantile statistics
| Minimum | 1902 |
|---|---|
| 5-th percentile | 1946 |
| Q1 | 1962 |
| median | 1975 |
| Q3 | 1986 |
| 95-th percentile | 2005 |
| Maximum | 2019 |
| Range | 117 |
| Interquartile range (IQR) | 24 |
Descriptive statistics
| Standard deviation | 17.420261 |
|---|---|
| Coefficient of variation (CV) | 0.0088238146 |
| Kurtosis | -0.42093114 |
| Mean | 1974.2324 |
| Median Absolute Deviation (MAD) | 12 |
| Skewness | 0.036911348 |
| Sum | 6.1112581 × 10^9 |
| Variance | 303.46548 |
| Monotonicity | Not monotonic |
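Several of the biryear figures above are derived from others, which gives a quick consistency check on the report. A minimal sketch using only the values printed in the tables; the tolerances are an assumption that allows for the report's rounding.

```python
# Summary values copied from the biryear tables above.
minimum, maximum = 1902, 2019
q1, q3 = 1962, 1986
mean, std = 1974.2324, 17.420261

value_range = maximum - minimum   # reported as 117
iqr = q3 - q1                     # reported as 24
cv = std / mean                   # reported as 0.0088238146
variance = std ** 2               # reported as 303.46548

# Compare against the reported figures, allowing for display rounding.
checks = {
    "range": value_range == 117,
    "iqr": iqr == 24,
    "cv": abs(cv - 0.0088238146) < 1e-7,
    "variance": abs(variance - 303.46548) < 1e-3,
}
print(checks)
```

Note that the maximum of 2019 is later than the constant i94yr value of 2016, so at least one birth year is likely a data-entry error.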
| Value | Count | Frequency (%) |
| 1986 | 71958 | 2.3% |
| 1983 | 70415 | 2.3% |
| 1985 | 70409 | 2.3% |
| 1982 | 70251 | 2.3% |
| 1984 | 69809 | 2.3% |
| 1981 | 69626 | 2.2% |
| 1980 | 67960 | 2.2% |
| 1987 | 67762 | 2.2% |
| 1976 | 66568 | 2.1% |
| 1979 | 66494 | 2.1% |
| Other values (102) | 2404259 | 77.6% |
| Value | Count | Frequency (%) |
| 1902 | 1 | < 0.1% |
| 1905 | 1 | < 0.1% |
| 1906 | 1 | < 0.1% |
| 1907 | 2 | < 0.1% |
| 1908 | 2 | < 0.1% |
| 1909 | 1 | < 0.1% |
| 1911 | 2 | < 0.1% |
| 1913 | 1 | < 0.1% |
| 1914 | 4 | < 0.1% |
| 1915 | 2 | < 0.1% |
| Value | Count | Frequency (%) |
| 2019 | 1 | < 0.1% |
| 2016 | 765 | < 0.1% |
| 2015 | 12747 | 0.4% |
| 2014 | 14756 | 0.5% |
| 2013 | 12704 | 0.4% |
| 2012 | 14411 | 0.5% |
| 2011 | 15129 | 0.5% |
| 2010 | 15773 | 0.5% |
| 2009 | 14233 | 0.5% |
| 2008 | 14607 | 0.5% |
dtaddto
Categorical
| Distinct | 777 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 477 |
| Missing (%) | < 0.1% |
| Memory size | 47.2 MiB |
| 07282016 | 67889 |
|---|---|
| 07272016 | 64789 |
| 07152016 | 63438 |
| 07262016 | 60234 |
| 07212016 | 58358 |
| Other values (772) |
Length
| Max length | 8 |
|---|---|
| Median length | 8 |
| Mean length | 7.9267661 |
| Min length | 3 |
Characters and Unicode
| Total characters | 24539968 |
|---|---|
| Distinct characters | 14 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 74 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | 10252016 |
|---|---|
| 2nd row | 10252016 |
| 3rd row | 10252016 |
| 4th row | 10252016 |
| 5th row | 10252016 |
Common Values
| Value | Count | Frequency (%) |
| 07282016 | 67889 | 2.2% |
| 07272016 | 64789 | 2.1% |
| 07152016 | 63438 | 2.0% |
| 07262016 | 60234 | 1.9% |
| 07212016 | 58358 | 1.9% |
| 07072016 | 57234 | 1.8% |
| 07132016 | 56363 | 1.8% |
| 06292016 | 56349 | 1.8% |
| 06302016 | 56134 | 1.8% |
| 07082016 | 56108 | 1.8% |
| Other values (767) | 2498940 | 80.7% |
Words
| Value | Count | Frequency (%) |
| 07282016 | 67889 | 2.2% |
| 07272016 | 64789 | 2.1% |
| 07152016 | 63438 | 2.0% |
| 07262016 | 60234 | 1.9% |
| 07212016 | 58358 | 1.9% |
| 07072016 | 57234 | 1.8% |
| 07132016 | 56363 | 1.8% |
| 06292016 | 56349 | 1.8% |
| 06302016 | 56134 | 1.8% |
| 07082016 | 56108 | 1.8% |
| Other values (770) | 2498943 |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 7262282 | |
| 1 | 5595137 | |
| 2 | 4437861 | |
| 6 | 3498236 | |
| 7 | 1798715 | 7.3% |
| 3 | 404160 | 1.6% |
| 5 | 379578 | 1.5% |
| 9 | 365010 | 1.5% |
| 8 | 349001 | 1.4% |
| 4 | 313952 | 1.3% |
| Other values (4) | 136036 | 0.6% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 24403932 | |
| Uppercase Letter | 90687 | 0.4% |
| Other Punctuation | 45344 | 0.2% |
| Space Separator | 5 | < 0.1% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 7262282 | |
| 1 | 5595137 | |
| 2 | 4437861 | |
| 6 | 3498236 | |
| 7 | 1798715 | 7.4% |
| 3 | 404160 | 1.7% |
| 5 | 379578 | 1.6% |
| 9 | 365010 | 1.5% |
| 8 | 349001 | 1.4% |
| 4 | 313952 | 1.3% |
Uppercase Letter
| Value | Count | Frequency (%) |
| D | 45344 | |
| S | 45343 |
Other Punctuation
| Value | Count | Frequency (%) |
| / | 45344 |
Space Separator
| Value | Count | Frequency (%) |
| 5 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 24449281 | |
| Latin | 90687 | 0.4% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 7262282 | |
| 1 | 5595137 | |
| 2 | 4437861 | |
| 6 | 3498236 | |
| 7 | 1798715 | 7.4% |
| 3 | 404160 | 1.7% |
| 5 | 379578 | 1.6% |
| 9 | 365010 | 1.5% |
| 8 | 349001 | 1.4% |
| 4 | 313952 | 1.3% |
| Other values (2) | 45349 | 0.2% |
Latin
| Value | Count | Frequency (%) |
| D | 45344 | |
| S | 45343 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 24539968 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 7262282 | |
| 1 | 5595137 | |
| 2 | 4437861 | |
| 6 | 3498236 | |
| 7 | 1798715 | 7.3% |
| 3 | 404160 | 1.6% |
| 5 | 379578 | 1.5% |
| 9 | 365010 | 1.5% |
| 8 | 349001 | 1.4% |
| 4 | 313952 | 1.3% |
| Other values (4) | 136036 | 0.6% |
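The dtaddto values are mostly eight-digit strings that read as MMDDYYYY, but the minimum length of 3 and the roughly 45344 occurrences each of `D`, `S` and `/` are consistent with a three-character non-date code (for example `D/S`; this is an assumption, the report does not show the value). A minimal sketch of a tolerant parser; the function name `parse_dtaddto` is hypothetical.

```python
from datetime import date, datetime
from typing import Optional


def parse_dtaddto(raw: Optional[str]) -> Optional[date]:
    """Parse an MMDDYYYY string; return None for anything that is not a valid date."""
    if raw is None:
        return None
    text = raw.strip()
    # Only eight-digit strings are candidates; codes such as "D/S" fall through.
    if len(text) != 8 or not text.isdigit():
        return None
    try:
        return datetime.strptime(text, "%m%d%Y").date()
    except ValueError:
        # Eight digits that do not form a real calendar date.
        return None


print(parse_dtaddto("07282016"), parse_dtaddto("D/S"))
```

Returning `None` rather than raising keeps the non-date codes countable afterwards, so they can be reported separately instead of being silently dropped.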
gender
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 414269 |
| Missing (%) | 13.4% |
| Memory size | 47.2 MiB |
| M | |
|---|---|
| F | |
| X | 1610 |
| U | 467 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 2682044 |
|---|---|
| Distinct characters | 4 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | M |
|---|---|
| 2nd row | F |
| 3rd row | M |
| 4th row | F |
| 5th row | F |
Common Values
| Value | Count | Frequency (%) |
| M | 1377224 | 44.5% |
| F | 1302743 | 42.1% |
| X | 1610 | 0.1% |
| U | 467 | < 0.1% |
| (Missing) | 414269 | 13.4% |
Words
| Value | Count | Frequency (%) |
| m | 1377224 | |
| f | 1302743 | |
| x | 1610 | 0.1% |
| u | 467 | < 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| M | 1377224 | |
| F | 1302743 | |
| X | 1610 | 0.1% |
| U | 467 | < 0.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 2682044 |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| M | 1377224 | |
| F | 1302743 | |
| X | 1610 | 0.1% |
| U | 467 | < 0.1% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 2682044 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| M | 1377224 | |
| F | 1302743 | |
| X | 1610 | 0.1% |
| U | 467 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 2682044 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| M | 1377224 | |
| F | 1302743 | |
| X | 1610 | 0.1% |
| U | 467 | < 0.1% |
insnum
Categorical
| Distinct | 1913 |
|---|---|
| Distinct (%) | 1.7% |
| Missing | 2982605 |
| Missing (%) | 96.3% |
| Memory size | 47.2 MiB |
| 3692 | 2155 |
|---|---|
| 3697 | 2033 |
| 3703 | 1986 |
| 3893 | 1866 |
| 3661 | 1820 |
| Other values (1908) |
Length
| Max length | 6 |
|---|---|
| Median length | 4 |
| Mean length | 3.9993932 |
| Min length | 1 |
Characters and Unicode
| Total characters | 454763 |
|---|---|
| Distinct characters | 34 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 744 |
|---|---|
| Unique (%) | 0.7% |
Sample
| 1st row | 7805 |
|---|---|
| 2nd row | 7805 |
| 3rd row | 5113 |
| 4th row | 5113 |
| 5th row | 5057 |
Common Values
| Value | Count | Frequency (%) |
| 3692 | 2155 | 0.1% |
| 3697 | 2033 | 0.1% |
| 3703 | 1986 | 0.1% |
| 3893 | 1866 | 0.1% |
| 3661 | 1820 | 0.1% |
| 3693 | 1690 | 0.1% |
| 3939 | 1680 | 0.1% |
| 3672 | 1678 | 0.1% |
| 3882 | 1673 | 0.1% |
| 3943 | 1662 | 0.1% |
| Other values (1903) | 95465 | 3.1% |
| (Missing) | 2982605 | 96.3% |
Words
| Value | Count | Frequency (%) |
| 3692 | 2155 | 1.9% |
| 3697 | 2033 | 1.8% |
| 3703 | 1986 | 1.7% |
| 3893 | 1866 | 1.6% |
| 3661 | 1820 | 1.6% |
| 3693 | 1690 | 1.5% |
| 3939 | 1680 | 1.5% |
| 3672 | 1678 | 1.5% |
| 3882 | 1673 | 1.5% |
| 3943 | 1662 | 1.5% |
| Other values (1903) | 95465 |
Most occurring characters
| Value | Count | Frequency (%) |
| 3 | 134774 | |
| 9 | 67871 | |
| 6 | 55410 | |
| 8 | 45323 | 10.0% |
| 7 | 34317 | 7.5% |
| 4 | 30600 | 6.7% |
| 5 | 28628 | 6.3% |
| 0 | 22052 | 4.8% |
| 2 | 19847 | 4.4% |
| 1 | 14773 | 3.2% |
| Other values (24) | 1168 | 0.3% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 453595 | |
| Uppercase Letter | 1168 | 0.3% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| N | 254 | |
| K | 254 | |
| U | 248 | |
| T | 45 | 3.9% |
| A | 35 | 3.0% |
| M | 32 | 2.7% |
| G | 32 | 2.7% |
| L | 27 | 2.3% |
| D | 25 | 2.1% |
| J | 24 | 2.1% |
| Other values (14) | 192 |
Decimal Number
| Value | Count | Frequency (%) |
| 3 | 134774 | |
| 9 | 67871 | |
| 6 | 55410 | |
| 8 | 45323 | 10.0% |
| 7 | 34317 | 7.6% |
| 4 | 30600 | 6.7% |
| 5 | 28628 | 6.3% |
| 0 | 22052 | 4.9% |
| 2 | 19847 | 4.4% |
| 1 | 14773 | 3.3% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 453595 | |
| Latin | 1168 | 0.3% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| N | 254 | |
| K | 254 | |
| U | 248 | |
| T | 45 | 3.9% |
| A | 35 | 3.0% |
| M | 32 | 2.7% |
| G | 32 | 2.7% |
| L | 27 | 2.3% |
| D | 25 | 2.1% |
| J | 24 | 2.1% |
| Other values (14) | 192 |
Common
| Value | Count | Frequency (%) |
| 3 | 134774 | |
| 9 | 67871 | |
| 6 | 55410 | |
| 8 | 45323 | 10.0% |
| 7 | 34317 | 7.6% |
| 4 | 30600 | 6.7% |
| 5 | 28628 | 6.3% |
| 0 | 22052 | 4.9% |
| 2 | 19847 | 4.4% |
| 1 | 14773 | 3.3% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 454763 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 3 | 134774 | |
| 9 | 67871 | |
| 6 | 55410 | |
| 8 | 45323 | 10.0% |
| 7 | 34317 | 7.5% |
| 4 | 30600 | 6.7% |
| 5 | 28628 | 6.3% |
| 0 | 22052 | 4.8% |
| 2 | 19847 | 4.4% |
| 1 | 14773 | 3.2% |
| Other values (24) | 1168 | 0.3% |
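Several variables in this report are mostly empty, so it can help to recompute the missing rates and flag sparse columns. A minimal sketch using counts copied from the report; column names are matched to their sections through the distinct counts listed in the alerts, and the 50% threshold is an arbitrary assumption for illustration.

```python
TOTAL_ROWS = 3096313  # "Number of observations" in the dataset statistics

# "Missing" counts as printed in the corresponding variable sections.
missing_counts = {
    "visapost": 1881250,
    "occup": 3088187,
    "gender": 414269,
    "insnum": 2982605,
}

# Missing rate per column, rounded to one decimal as in the report.
missing_pct = {
    name: round(100 * count / TOTAL_ROWS, 1)
    for name, count in missing_counts.items()
}

THRESHOLD = 50.0  # hypothetical cut-off for calling a column sparse
sparse_columns = sorted(n for n, p in missing_pct.items() if p > THRESHOLD)
print(missing_pct, sparse_columns)
```

Whether a sparse column should be dropped depends on the analysis; a field like occup may still be informative for the few rows that have it.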
airline
Categorical
| Distinct | 534 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 83627 |
| Missing (%) | 2.7% |
| Memory size | 47.2 MiB |
| AA | |
|---|---|
| UA | |
| DL | |
| BA | |
| LH | 120556 |
| Other values (529) |
Length
| Max length | 3 |
|---|---|
| Median length | 2 |
| Mean length | 2.0144366 |
| Min length | 1 |
Characters and Unicode
| Total characters | 6068865 |
|---|---|
| Distinct characters | 37 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 136 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | AA |
|---|---|
| 2nd row | AA |
| 3rd row | AA |
| 4th row | AA |
| 5th row | DL |
Common Values
| Value | Count | Frequency (%) |
| AA | 310091 | 10.0% |
| UA | 264271 | 8.5% |
| DL | 252526 | 8.2% |
| BA | 190997 | 6.2% |
| LH | 120556 | 3.9% |
| VS | 113384 | 3.7% |
| AF | 81113 | 2.6% |
| KE | 71047 | 2.3% |
| JL | 69075 | 2.2% |
| AM | 60307 | 1.9% |
| Other values (524) | 1479319 | 47.8% |
| (Missing) | 83627 | 2.7% |
Words
| Value | Count | Frequency (%) |
| aa | 310091 | 10.3% |
| ua | 264271 | 8.8% |
| dl | 252526 | 8.4% |
| ba | 190997 | 6.3% |
| lh | 120556 | 4.0% |
| vs | 113384 | 3.8% |
| af | 81113 | 2.7% |
| ke | 71047 | 2.4% |
| jl | 69075 | 2.3% |
| am | 60307 | 2.0% |
| Other values (522) | 1479319 |
Most occurring characters
| Value | Count | Frequency (%) |
| A | 1495745 | |
| L | 591473 | 9.7% |
| U | 350380 | 5.8% |
| B | 326914 | 5.4% |
| D | 311780 | 5.1% |
| K | 270109 | 4.5% |
| S | 265466 | 4.4% |
| V | 224290 | 3.7% |
| E | 210965 | 3.5% |
| H | 205568 | 3.4% |
| Other values (27) | 1816175 |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 5885564 | |
| Decimal Number | 175945 | 2.9% |
| Other Punctuation | 7356 | 0.1% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| A | 1495745 | |
| L | 591473 | 10.0% |
| U | 350380 | 6.0% |
| B | 326914 | 5.6% |
| D | 311780 | 5.3% |
| K | 270109 | 4.6% |
| S | 265466 | 4.5% |
| V | 224290 | 3.8% |
| E | 210965 | 3.6% |
| H | 205568 | 3.5% |
| Other values (16) | 1632874 |
Decimal Number
| Value | Count | Frequency (%) |
| 4 | 63988 | |
| 6 | 49589 | |
| 7 | 23902 | 13.6% |
| 3 | 22990 | 13.1% |
| 9 | 7070 | 4.0% |
| 2 | 5038 | 2.9% |
| 5 | 1928 | 1.1% |
| 8 | 864 | 0.5% |
| 0 | 500 | 0.3% |
| 1 | 76 | < 0.1% |
Other Punctuation
| Value | Count | Frequency (%) |
| * | 7356 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 5885564 | |
| Common | 183301 | 3.0% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| A | 1495745 | |
| L | 591473 | 10.0% |
| U | 350380 | 6.0% |
| B | 326914 | 5.6% |
| D | 311780 | 5.3% |
| K | 270109 | 4.6% |
| S | 265466 | 4.5% |
| V | 224290 | 3.8% |
| E | 210965 | 3.6% |
| H | 205568 | 3.5% |
| Other values (16) | 1632874 |
Common
| Value | Count | Frequency (%) |
| 4 | 63988 | |
| 6 | 49589 | |
| 7 | 23902 | 13.0% |
| 3 | 22990 | 12.5% |
| * | 7356 | 4.0% |
| 9 | 7070 | 3.9% |
| 2 | 5038 | 2.7% |
| 5 | 1928 | 1.1% |
| 8 | 864 | 0.5% |
| 0 | 500 | 0.3% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 6068865 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| A | 1495745 | |
| L | 591473 | 9.7% |
| U | 350380 | 5.8% |
| B | 326914 | 5.4% |
| D | 311780 | 5.1% |
| K | 270109 | 4.5% |
| S | 265466 | 4.4% |
| V | 224290 | 3.7% |
| E | 210965 | 3.5% |
| H | 205568 | 3.4% |
| Other values (27) | 1816175 |
admnum
Real number (ℝ)
| Distinct | 3075579 |
|---|---|
| Distinct (%) | 99.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 7.082885 × 10^10 |
| Minimum | 0 |
|---|---|
| Maximum | 9.9915566 × 10^10 |
| Zeros | 68 |
| Zeros (%) | < 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 47.2 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 4.6608464 × 10^10 |
| Q1 | 5.6035228 × 10^10 |
| median | 5.9360939 × 10^10 |
| Q3 | 9.350987 × 10^10 |
| 95-th percentile | 9.4729123 × 10^10 |
| Maximum | 9.9915566 × 10^10 |
| Range | 9.9915566 × 10^10 |
| Interquartile range (IQR) | 3.7474641 × 10^10 |
Descriptive statistics
| Standard deviation | 2.2154416 × 10^10 |
|---|---|
| Coefficient of variation (CV) | 0.31278802 |
| Kurtosis | 0.53739406 |
| Mean | 7.082885 × 10^10 |
| Median Absolute Deviation (MAD) | 3.9545292 × 10^9 |
| Skewness | -0.63161887 |
| Sum | 2.1930829 × 10^17 |
| Variance | 4.9081815 × 10^20 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 68 | < 0.1% |
| 7.812054623 × 10^10 | 11 | < 0.1% |
| 4.652077483 × 10^10 | 9 | < 0.1% |
| 8.924999063 × 10^10 | 9 | < 0.1% |
| 8.904084763 × 10^10 | 8 | < 0.1% |
| 4.701040863 × 10^10 | 8 | < 0.1% |
| 5.603642833 × 10^10 | 7 | < 0.1% |
| 8.581902733 × 10^10 | 7 | < 0.1% |
| 3.697806763 × 10^10 | 7 | < 0.1% |
| 9.322572153 × 10^10 | 7 | < 0.1% |
| Other values (3075569) | 3096172 | > 99.9% |
| Value | Count | Frequency (%) |
| 0 | 68 | < 0.1% |
| 27 | 1 | < 0.1% |
| 1218224 | 1 | < 0.1% |
| 1219024 | 1 | < 0.1% |
| 1219124 | 1 | < 0.1% |
| 1219224 | 1 | < 0.1% |
| 1219324 | 1 | < 0.1% |
| 1219424 | 1 | < 0.1% |
| 1222424 | 1 | < 0.1% |
| 1226124 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 9.991556593 × 10^10 | 1 | < 0.1% |
| 9.888880022 × 10^10 | 1 | < 0.1% |
| 9.888879932 × 10^10 | 1 | < 0.1% |
| 9.888879842 × 10^10 | 1 | < 0.1% |
| 9.888879752 × 10^10 | 1 | < 0.1% |
| 9.888879662 × 10^10 | 1 | < 0.1% |
| 9.888879572 × 10^10 | 1 | < 0.1% |
| 9.888878742 × 10^10 | 1 | < 0.1% |
| 9.888878652 × 10^10 | 1 | < 0.1% |
| 9.888878562 × 10^10 | 1 | < 0.1% |
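admnum is profiled as a real number, which is why its values are shown in scientific notation and why statistics such as mean and variance are computed for it. If it is really an integer identifier (an assumption; the report does not say), values of this magnitude are far below 2^53 and can be converted to integers without loss. A minimal sketch; the helper name `admnum_to_int` and the sample value are hypothetical.

```python
import math


def admnum_to_int(value: float) -> int:
    """Convert a float-typed identifier to int, refusing lossy conversions."""
    if not math.isfinite(value):
        raise ValueError("identifier is NaN or infinite")
    if not value.is_integer():
        raise ValueError(f"identifier has a fractional part: {value!r}")
    if abs(value) >= 2 ** 53:
        # Above 2**53 a float can no longer represent every integer exactly.
        raise ValueError("identifier is too large to be exact as a float")
    return int(value)


# Hypothetical value of the same magnitude as the reported maximum.
example = admnum_to_int(99915565930.0)
print(example)
```

Treating the column as an identifier would also explain why 99.3% of its values are distinct, and it makes the 68 zeros stand out as likely placeholders.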
fltno
Categorical
| Distinct | 7152 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 19549 |
| Missing (%) | 0.6% |
| Memory size | 47.2 MiB |
| LAND | 44297 |
|---|---|
| 00006 | 30942 |
| 00001 | 29487 |
| 00007 | 23999 |
| 00008 | 22783 |
| Other values (7147) |
Length
| Max length | 5 |
|---|---|
| Median length | 5 |
| Mean length | 4.9371242 |
| Min length | 1 |
Characters and Unicode
| Total characters | 15190366 |
|---|---|
| Distinct characters | 38 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 1310 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | 00262 |
|---|---|
| 2nd row | 00262 |
| 3rd row | 00262 |
| 4th row | 00262 |
| 5th row | 00188 |
Common Values
| Value | Count | Frequency (%) |
| LAND | 44297 | 1.4% |
| 00006 | 30942 | 1.0% |
| 00001 | 29487 | 1.0% |
| 00007 | 23999 | 0.8% |
| 00008 | 22783 | 0.7% |
| 00003 | 21458 | 0.7% |
| 00011 | 20238 | 0.7% |
| 00005 | 20106 | 0.6% |
| 00012 | 18992 | 0.6% |
| 00015 | 18000 | 0.6% |
| Other values (7142) | 2826462 | 91.3% |
| (Missing) | 19549 | 0.6% |
Words
| Value | Count | Frequency (%) |
| land | 44297 | 1.4% |
| 00006 | 30942 | 1.0% |
| 00001 | 29487 | 1.0% |
| 00007 | 23999 | 0.8% |
| 00008 | 22783 | 0.7% |
| 00003 | 21458 | 0.7% |
| 00011 | 20238 | 0.7% |
| 00005 | 20106 | 0.7% |
| 00012 | 18992 | 0.6% |
| 00015 | 18000 | 0.6% |
| Other values (7142) | 2826463 |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 7229357 | |
| 1 | 1321534 | 8.7% |
| 2 | 1111407 | 7.3% |
| 4 | 851247 | 5.6% |
| 8 | 822253 | 5.4% |
| 7 | 777395 | 5.1% |
| 9 | 730173 | 4.8% |
| 6 | 726262 | 4.8% |
| 5 | 707330 | 4.7% |
| 3 | 674832 | 4.4% |
| Other values (28) | 238576 | 1.6% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 14951790 | |
| Uppercase Letter | 238571 | 1.6% |
| Dash Punctuation | 4 | < 0.1% |
| Space Separator | 1 | < 0.1% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| A | 58858 | |
| N | 53628 | |
| D | 50442 | |
| L | 44963 | |
| M | 7471 | 3.1% |
| C | 5530 | 2.3% |
| X | 2236 | 0.9% |
| B | 1718 | 0.7% |
| R | 1551 | 0.7% |
| S | 1538 | 0.6% |
| Other values (16) | 10636 | 4.5% |
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 7229357 | |
| 1 | 1321534 | 8.8% |
| 2 | 1111407 | 7.4% |
| 4 | 851247 | 5.7% |
| 8 | 822253 | 5.5% |
| 7 | 777395 | 5.2% |
| 9 | 730173 | 4.9% |
| 6 | 726262 | 4.9% |
| 5 | 707330 | 4.7% |
| 3 | 674832 | 4.5% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 4 |
Space Separator
| Value | Count | Frequency (%) |
| 1 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 14951795 | |
| Latin | 238571 | 1.6% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| A | 58858 | |
| N | 53628 | |
| D | 50442 | |
| L | 44963 | |
| M | 7471 | 3.1% |
| C | 5530 | 2.3% |
| X | 2236 | 0.9% |
| B | 1718 | 0.7% |
| R | 1551 | 0.7% |
| S | 1538 | 0.6% |
| Other values (16) | 10636 | 4.5% |
Common
| Value | Count | Frequency (%) |
| 0 | 7229357 | |
| 1 | 1321534 | 8.8% |
| 2 | 1111407 | 7.4% |
| 4 | 851247 | 5.7% |
| 8 | 822253 | 5.5% |
| 7 | 777395 | 5.2% |
| 9 | 730173 | 4.9% |
| 6 | 726262 | 4.9% |
| 5 | 707330 | 4.7% |
| 3 | 674832 | 4.5% |
| Other values (2) | 5 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 15190366 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 7229357 | |
| 1 | 1321534 | 8.7% |
| 2 | 1111407 | 7.3% |
| 4 | 851247 | 5.6% |
| 8 | 822253 | 5.4% |
| 7 | 777395 | 5.1% |
| 9 | 730173 | 4.8% |
| 6 | 726262 | 4.8% |
| 5 | 707330 | 4.7% |
| 3 | 674832 | 4.4% |
| Other values (28) | 238576 | 1.6% |
visatype
Categorical
| Distinct | 17 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 47.2 MiB |
| WT | |
|---|---|
| B2 | |
| WB | |
| B1 | |
| GMT | 89133 |
| Other values (12) | 84831 |
Length
| Max length | 3 |
|---|---|
| Median length | 2 |
| Mean length | 2.0278163 |
| Min length | 1 |
Characters and Unicode
| Total characters | 6278754 |
|---|---|
| Distinct characters | 14 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | B2 |
|---|---|
| 2nd row | B2 |
| 3rd row | B2 |
| 4th row | B1 |
| 5th row | B2 |
Common Values
| Value | Count | Frequency (%) |
| WT | 1309059 | 42.3% |
| B2 | 1117897 | 36.1% |
| WB | 282983 | 9.1% |
| B1 | 212410 | 6.9% |
| GMT | 89133 | 2.9% |
| F1 | 39016 | 1.3% |
| E2 | 19383 | 0.6% |
| CP | 14758 | 0.5% |
| E1 | 3743 | 0.1% |
| I | 3176 | 0.1% |
| Other values (7) | 4755 | 0.2% |
Length
| Value | Count | Frequency (%) |
| wt | 1309059 | 42.3% |
| b2 | 1117897 | 36.1% |
| wb | 282983 | 9.1% |
| b1 | 212410 | 6.9% |
| gmt | 89133 | 2.9% |
| f1 | 39016 | 1.3% |
| e2 | 19383 | 0.6% |
| cp | 14758 | 0.5% |
| e1 | 3743 | 0.1% |
| i | 3176 | 0.1% |
| Other values (7) | 4755 | 0.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| B | 1613451 | 25.7% |
| W | 1592042 | 25.4% |
| T | 1398192 | 22.3% |
| 2 | 1140313 | 18.2% |
| 1 | 256720 | 4.1% |
| M | 90649 | 1.4% |
| G | 89283 | 1.4% |
| F | 42000 | 0.7% |
| E | 23126 | 0.4% |
| P | 14779 | 0.2% |
| Other values (4) | 18199 | 0.3% |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 4881721 | 77.7% |
| Decimal Number | 1397033 | 22.3% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| B | 1613451 | 33.1% |
| W | 1592042 | 32.6% |
| T | 1398192 | 28.6% |
| M | 90649 | 1.9% |
| G | 89283 | 1.8% |
| F | 42000 | 0.9% |
| E | 23126 | 0.5% |
| P | 14779 | 0.3% |
| C | 14768 | 0.3% |
| I | 3410 | 0.1% |
| Other values (2) | 21 | < 0.1% |
Decimal Number
| Value | Count | Frequency (%) |
| 2 | 1140313 | 81.6% |
| 1 | 256720 | 18.4% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 4881721 | 77.7% |
| Common | 1397033 | 22.3% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| B | 1613451 | 33.1% |
| W | 1592042 | 32.6% |
| T | 1398192 | 28.6% |
| M | 90649 | 1.9% |
| G | 89283 | 1.8% |
| F | 42000 | 0.9% |
| E | 23126 | 0.5% |
| P | 14779 | 0.3% |
| C | 14768 | 0.3% |
| I | 3410 | 0.1% |
| Other values (2) | 21 | < 0.1% |
Common
| Value | Count | Frequency (%) |
| 2 | 1140313 | 81.6% |
| 1 | 256720 | 18.4% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 6278754 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| B | 1613451 | 25.7% |
| W | 1592042 | 25.4% |
| T | 1398192 | 22.3% |
| 2 | 1140313 | 18.2% |
| 1 | 256720 | 4.1% |
| M | 90649 | 1.4% |
| G | 89283 | 1.4% |
| F | 42000 | 0.7% |
| E | 23126 | 0.4% |
| P | 14779 | 0.2% |
| Other values (4) | 18199 | 0.3% |
updated_at
Date
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 47.2 MiB |
| Minimum | 2022-12-10 21:24:30.135479 |
|---|---|
| Maximum | 2022-12-10 21:24:30.135479 |
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at capturing nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
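The definition above can be sketched in plain Python. This is only an illustration of the rank-then-correlate idea, not the code the report generator runs; the helper names `average_ranks`, `pearson`, and `spearman` are hypothetical, and the age and birth-year values are taken from the first sample rows (`i94bir`, `biryear`).

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho: Pearson's r applied to the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))


# Ages and birth years from the first five sample rows (i94bir, biryear).
age = [48, 45, 20, 39, 53]
birth_year = [1968, 1971, 1996, 1977, 1963]
rho_age_birth = spearman(age, birth_year)

# A monotonic but nonlinear relationship: rho stays at 1 while r drops below 1.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
rho_cubic = spearman(x, y)
r_cubic = pearson(x, y)

print(rho_age_birth, rho_cubic, r_cubic)
```

Because age and birth year move in exactly opposite order, ρ for that pair comes out at -1 up to floating-point rounding, which matches the high correlation the report flags between `i94bir` and `biryear`.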
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
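As a small illustration of this formula and of the invariance property, the sketch below computes r directly from its definition. The function name `pearson_r` is hypothetical, the input values come from the first sample rows (`i94bir`, `biryear`), and the shift and scale factors are arbitrary positive choices made for the example.

```python
from statistics import mean


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    # The 1/n factors of covariance and standard deviations cancel out.
    return cov / (sx * sy)


# Ages and birth years from the first five sample rows (i94bir, biryear).
age = [48.0, 45.0, 20.0, 39.0, 53.0]
birth_year = [1968.0, 1971.0, 1996.0, 1977.0, 1963.0]

r = pearson_r(age, birth_year)

# Shift and rescale each variable separately (positive scale factors):
# r should be unchanged up to floating-point rounding.
age_rescaled = [2.0 * a + 10.0 for a in age]
year_rescaled = [0.5 * b - 3.0 for b in birth_year]
r_rescaled = pearson_r(age_rescaled, year_rescaled)

print(r, r_rescaled)
```

Both values should agree, since changing the location or (positive) scale of either variable rescales the covariance and the standard deviations by the same factor.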
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
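The pair-counting description above corresponds to the simplest variant, often called tau-a. A minimal sketch follows, assuming small inputs; the function name `kendall_tau_a` and the example values are hypothetical. Statistical libraries commonly default to a tie-adjusted variant (tau-b), so results can differ when the data contain ties.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs; tied pairs count as neither."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables order the pair the same way
        elif sign < 0:
            discordant += 1   # the variables order the pair in opposite ways
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Hypothetical example: two adjacent swaps in an otherwise increasing sequence.
x = [1, 2, 3, 4, 5]
y = [1, 3, 2, 5, 4]
tau = kendall_tau_a(x, y)
print(tau)  # 8 concordant and 2 discordant pairs out of 10 -> 0.6
```

This direct loop visits every pair, so its cost grows quadratically with the number of observations; for a dataset of this size a sampled or O(n log n) implementation would be the more realistic choice.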
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available.
| | cicid | i94yr | i94mon | i94cit | i94res | i94port | arrdate | i94mode | i94addr | depdate | i94bir | i94visa | count | dtadfile | visapost | occup | entdepa | entdepd | entdepu | matflag | biryear | dtaddto | gender | insnum | airline | admnum | fltno | visatype | updated_at |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 4904480.0 | 2016.0 | 4.0 | 245.0 | 245.0 | CHI | 20570.0 | 1.0 | VA | 20583.0 | 48.0 | 2.0 | 1.0 | 20160426 | BEJ | None | G | O | None | M | 1968.0 | 10252016 | M | None | AA | 9.462202e+10 | 00262 | B2 | 2022-12-10 21:24:30.135479 |
| 1 | 4904481.0 | 2016.0 | 4.0 | 245.0 | 245.0 | CHI | 20570.0 | 1.0 | VA | 20583.0 | 45.0 | 2.0 | 1.0 | 20160426 | BEJ | None | G | O | None | M | 1971.0 | 10252016 | F | None | AA | 9.462196e+10 | 00262 | B2 | 2022-12-10 21:24:30.135479 |
| 2 | 4904482.0 | 2016.0 | 4.0 | 245.0 | 245.0 | CHI | 20570.0 | 1.0 | VA | 20583.0 | 20.0 | 2.0 | 1.0 | 20160426 | BEJ | None | G | O | None | M | 1996.0 | 10252016 | M | None | AA | 9.462200e+10 | 00262 | B2 | 2022-12-10 21:24:30.135479 |
| 3 | 4904483.0 | 2016.0 | 4.0 | 245.0 | 245.0 | CHI | 20570.0 | 1.0 | WA | 20580.0 | 39.0 | 1.0 | 1.0 | 20160426 | BEJ | None | G | O | None | M | 1977.0 | 10252016 | F | None | AA | 9.462163e+10 | 00262 | B1 | 2022-12-10 21:24:30.135479 |
| 4 | 4904490.0 | 2016.0 | 4.0 | 245.0 | 245.0 | CHI | 20570.0 | 1.0 | DE | 20595.0 | 53.0 | 2.0 | 1.0 | 20160426 | BEJ | None | G | O | None | M | 1963.0 | 10252016 | F | None | DL | 9.461481e+10 | 00188 | B2 | 2022-12-10 21:24:30.135479 |
| 5 | 4904491.0 | 2016.0 | 4.0 | 245.0 | 245.0 | CHI | 20570.0 | 1.0 | LA | 20645.0 | 50.0 | 2.0 | 1.0 | 20160426 | BEJ | None | G | O | None | M | 1966.0 | 10252016 | F | None | DL | 9.461285e+10 | 00188 | B2 | 2022-12-10 21:24:30.135479 |
| 6 | 4904492.0 | 2016.0 | 4.0 | 245.0 | 245.0 | CHI | 20570.0 | 1.0 | ME | 20586.0 | 50.0 | 2.0 | 1.0 | 20160426 | BEJ | None | G | O | None | M | 1966.0 | 10252016 | M | None | DL | 9.461457e+10 | 00188 | B2 | 2022-12-10 21:24:30.135479 |
| 7 | 4904493.0 | 2016.0 | 4.0 | 245.0 | 245.0 | CHI | 20570.0 | 1.0 | MI | NaN | 59.0 | 2.0 | 1.0 | 20160426 | BEJ | None | G | None | None | None | 1957.0 | 10252016 | F | None | DL | 9.459601e+10 | 00582 | B2 | 2022-12-10 21:24:30.135479 |
| 8 | 4904494.0 | 2016.0 | 4.0 | 245.0 | 245.0 | CHI | 20570.0 | 1.0 | MI | NaN | 31.0 | 2.0 | 1.0 | 20160426 | BEJ | None | G | None | None | None | 1985.0 | 10252016 | F | None | DL | 9.459587e+10 | 00582 | B2 | 2022-12-10 21:24:30.135479 |
| 9 | 4904495.0 | 2016.0 | 4.0 | 245.0 | 245.0 | CHI | 20570.0 | 1.0 | MI | NaN | 1.0 | 2.0 | 1.0 | 20160426 | BEJ | None | G | None | None | None | 2015.0 | 10252016 | F | None | DL | 9.459593e+10 | 00582 | B2 | 2022-12-10 21:24:30.135479 |
| | cicid | i94yr | i94mon | i94cit | i94res | i94port | arrdate | i94mode | i94addr | depdate | i94bir | i94visa | count | dtadfile | visapost | occup | entdepa | entdepd | entdepu | matflag | biryear | dtaddto | gender | insnum | airline | admnum | fltno | visatype | updated_at |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 220150 | 4487645.0 | 2016.0 | 4.0 | 117.0 | 117.0 | PHI | 20568.0 | 1.0 | TX | 20573.0 | 37.0 | 1.0 | 1.0 | 20160424 | None | None | G | O | None | M | 1979.0 | 07222016 | M | None | AA | 5.918488e+10 | 00741 | WB | 2022-12-10 21:24:30.135479 |
| 220151 | 4487646.0 | 2016.0 | 4.0 | 117.0 | 117.0 | PHI | 20568.0 | 1.0 | TX | 20574.0 | 40.0 | 2.0 | 1.0 | 20160424 | None | None | G | O | None | M | 1976.0 | 07222016 | F | None | AA | 5.918871e+10 | 00719 | WT | 2022-12-10 21:24:30.135479 |
| 220152 | 4487647.0 | 2016.0 | 4.0 | 117.0 | 117.0 | PHI | 20568.0 | 1.0 | TX | 20574.0 | 34.0 | 2.0 | 1.0 | 20160424 | None | None | G | O | None | M | 1982.0 | 07222016 | M | None | AA | 5.918964e+10 | 00719 | WT | 2022-12-10 21:24:30.135479 |
| 220153 | 4487648.0 | 2016.0 | 4.0 | 117.0 | 117.0 | PHI | 20568.0 | 1.0 | TX | 20574.0 | 34.0 | 1.0 | 1.0 | 20160424 | None | None | G | O | None | M | 1982.0 | 07222016 | M | None | AA | 5.918990e+10 | 00719 | WB | 2022-12-10 21:24:30.135479 |
| 220154 | 4487649.0 | 2016.0 | 4.0 | 117.0 | 117.0 | PHI | 20568.0 | 1.0 | TX | 20579.0 | 27.0 | 1.0 | 1.0 | 20160424 | None | None | G | O | None | M | 1989.0 | 07222016 | M | None | AA | 5.917996e+10 | 00701 | WB | 2022-12-10 21:24:30.135479 |
| 220155 | 4487650.0 | 2016.0 | 4.0 | 117.0 | 117.0 | PHI | 20568.0 | 1.0 | US | 20574.0 | 25.0 | 1.0 | 1.0 | 20160424 | None | None | G | O | None | M | 1991.0 | 07222016 | F | None | AA | 5.918947e+10 | 00719 | WB | 2022-12-10 21:24:30.135479 |
| 220156 | 4487651.0 | 2016.0 | 4.0 | 117.0 | 117.0 | PHI | 20568.0 | 1.0 | VA | 20574.0 | 40.0 | 1.0 | 1.0 | 20160424 | None | None | G | O | None | M | 1976.0 | 07222016 | M | None | AA | 5.918863e+10 | 00719 | WB | 2022-12-10 21:24:30.135479 |
| 220157 | 4487652.0 | 2016.0 | 4.0 | 117.0 | 117.0 | PHO | 20568.0 | 1.0 | AZ | 20578.0 | 45.0 | 2.0 | 1.0 | 20160424 | None | None | G | O | None | M | 1971.0 | 07222016 | M | None | BA | 5.921783e+10 | 00289 | WT | 2022-12-10 21:24:30.135479 |
| 220158 | 4487653.0 | 2016.0 | 4.0 | 117.0 | 117.0 | PHO | 20568.0 | 1.0 | AZ | 20581.0 | 35.0 | 2.0 | 1.0 | 20160424 | None | None | G | Q | None | M | 1981.0 | 07222016 | M | None | BA | 5.921791e+10 | 00289 | WT | 2022-12-10 21:24:30.135479 |
| 220159 | 4487654.0 | 2016.0 | 4.0 | 117.0 | 117.0 | PHO | 20568.0 | 1.0 | AZ | 20581.0 | 32.0 | 2.0 | 1.0 | 20160424 | None | None | G | Q | None | M | 1984.0 | 07222016 | M | None | BA | 5.921792e+10 | 00289 | WT | 2022-12-10 21:24:30.135479 |